✨ Add --fail-on-diff option and improve SDK resilience #169

Robdel12 · 2026-01-05T08:08:05Z

Summary

Visual diffs no longer break tests by default - SDKs can now choose whether to fail on diffs
Server returns 200 for diffs with status: 'diff' instead of 422 error
Added --fail-on-diff CLI flag for vizzly tdd start and vizzly tdd run

Changes

Server:

Visual diffs return 200 with status: 'diff' instead of 422
Consistent response format with status field: 'match', 'diff', 'new', or 'baseline-updated'

CLI:

--fail-on-diff flag on vizzly tdd start and vizzly tdd run
failOnDiff stored in server.json for SDK discovery

Ember SDK:

Read failOnDiff from server.json, env var, or per-snapshot option
Gracefully skip snapshots when no server running (like Percy)
failOnDiff option on vizzlySnapshot() for per-snapshot control

Usage

# Default: diffs logged as warnings, tests pass
vizzly tdd start

# Fail tests on visual diffs
vizzly tdd start --fail-on-diff

# Or via environment variable
VIZZLY_FAIL_ON_DIFF=true npm test

# Or per-snapshot
await vizzlySnapshot('critical-ui', { failOnDiff: true });

Test plan

Updated tdd-handler.test.js for new 200 response format
Added tests for diff response structure
Added tests for getServerInfo() reading failOnDiff
All CLI tests pass
All Ember SDK tests pass

Visual diffs no longer break tests by default - SDKs can now choose whether to fail on diffs via CLI flag, env var, or per-snapshot option. **Server changes:** - Visual diffs return 200 with `status: 'diff'` instead of 422 - Consistent response format: `status` field is 'match', 'diff', 'new', or 'baseline-updated' - SDKs can decide whether to fail tests based on the status **CLI changes:** - Add `--fail-on-diff` flag to `vizzly tdd start` and `vizzly tdd run` - Store `failOnDiff` in server.json for SDK discovery **Ember SDK changes:** - Read `failOnDiff` from server.json, env var (VIZZLY_FAIL_ON_DIFF), or per-snapshot option - Gracefully skip snapshots when no server is running (like Percy) - Add `failOnDiff` option to `vizzlySnapshot()` for per-snapshot control **Priority order for failOnDiff:** 1. Per-snapshot option: `vizzlySnapshot('name', { failOnDiff: true })` 2. Environment variable: `VIZZLY_FAIL_ON_DIFF=true` 3. CLI flag: `vizzly tdd start --fail-on-diff` 4. Default: false (diffs logged as warnings, tests pass)

claude · 2026-01-05T08:11:00Z

Code Review: Add --fail-on-diff option and improve SDK resilience

Overall, this is a well-designed and well-tested PR that successfully decouples visual diff detection from test failure. The implementation is clean, consistent, and follows good engineering practices.

✅ Strengths

1. Excellent API Design

The priority hierarchy (per-snapshot > env var > server config > default) is intuitive and flexible
Returning 200 with status: 'diff' instead of 422 is the right choice - it's not an error, it's data
Consistent response format across all status types (match, diff, new, baseline-updated)

2. Strong Test Coverage

Comprehensive unit tests for server response formats
Tests for config discovery and priority resolution
Tests for graceful degradation when server isn't running
All existing tests updated to handle new response format

3. Good Documentation

Clear JSDoc comments explaining the priority chain
Helpful examples in the PR description
Code comments explaining the reasoning

4. Backwards Compatibility

Default behavior (diffs don't fail tests) maintains existing workflow
Graceful handling when Vizzly server isn't running (Percy-style behavior)

🔍 Issues & Suggestions

1. Missing Test: Per-Snapshot failOnDiff Override

Severity: Medium

The Ember SDK supports vizzlySnapshot('name', { failOnDiff: true }) but there's no test verifying this actually works. This is a critical feature that should be tested.

Location: clients/ember/src/test-support/index.js:278

2. Inconsistent Error Handling for Missing diffPercentage

Severity: Low

In clients/ember/src/test-support/index.js:282, you use optional chaining for result.diffPercentage?.toFixed(2), but the server always returns diffPercentage when status === 'diff'. The optional chaining suggests the field might be missing, which could hide bugs.

Location: clients/ember/src/test-support/index.js:282

3. Potential Race Condition in Server Discovery Cache

Severity: Low

The cachedServerInfo in snapshot-server.js is cached globally but only cleared in tests. If the TDD server is restarted with different failOnDiff settings during a long-running test session, the cache won't reflect the new value.

Location: clients/ember/src/launcher/snapshot-server.js:37-40

4. Missing Environment Variable Documentation

Severity: Low

The PR adds VIZZLY_FAIL_ON_DIFF environment variable, but it's not documented in CLAUDE.md (which lists other env vars around line 120).

5. Function Name Could Be Clearer

Severity: Very Low (nitpick)

The function shouldFailOnDiff() in test-support/index.js:142 only checks the environment variable, not the full priority chain. Consider renaming to getEnvFailOnDiff() for clarity.

🎯 Code Quality

Excellent:

Clean separation of concerns (server returns data, SDK decides what to do with it)
No breaking changes to existing API
Consistent naming conventions
Proper error messages with actionable guidance

🔒 Security & Performance

No security concerns identified. The changes don't introduce any new attack surfaces or handle sensitive data.

Performance impact: Negligible. Minimal I/O impact and trivial CPU cost for boolean flag checks.

✨ Final Verdict

This is production-ready with minor improvements recommended. The core design is sound, the implementation is clean, and test coverage is good. Address Issue #1 (missing per-snapshot test) before merging, and consider the other suggestions for future polish.

Estimated risk: Low - Changes are well-isolated, backwards-compatible, and thoroughly tested.

Nice work on this feature! The API design (especially the priority chain and the 200-not-422 decision) shows thoughtful engineering.

Fixes #179 - Vitest client was checking for 'passed'/'failed' status values but PR #169 changed the TDD handler to return 'match'/'diff'. **Vitest SDK fix:** - Changed status checks from 'passed' to 'match' - Changed status checks from 'failed' to 'diff' - Added handling for 'baseline-updated' status **CI improvements:** - Added `test-sdk-e2e` job that always runs (not conditional on path changes) - Tests Core JS Client, Vitest, Ember, and Ruby SDKs end-to-end - Removed redundant `test-site` job (now part of test-sdk-e2e) - Shared Playwright browser cache across SDK tests This ensures CLI changes that break SDK integrations are caught before merge.

## Summary - Fix Vitest SDK to use correct status values from TDD handler (`match`/`diff` instead of `passed`/`failed`) - Add `test-sdk-e2e` CI job that always runs to catch CLI changes that break SDK integrations - Remove redundant `test-site` job (consolidated into `test-sdk-e2e`) - Add shared Playwright browser cache across SDK tests for faster CI Fixes #179 ## Changes **Vitest SDK (`clients/vitest/src/setup.js`):** - Changed `'passed'` → `'match'` status check - Changed `'failed'` → `'diff'` status check - Added handling for `'baseline-updated'` status **CI Workflow (`.github/workflows/ci.yml`):** - Added `test-sdk-e2e` job that runs on every PR/push (not conditional on SDK file changes) - Tests Core JS Client, Vitest SDK, Ember SDK, and Ruby SDK end-to-end - Consolidated `test-site` into `test-sdk-e2e` - Shared Playwright browser cache (chromium + firefox) across all SDK tests ## Context PR #169 changed the TDD handler to return `'match'`/`'diff'` instead of `'passed'`/`'failed'`, but the Vitest client wasn't updated. This caused the error "Unknown comparison status: match" reported in #179. The SDK E2E tests would have caught this - they now run on every PR to prevent similar regressions. See #180 for follow-up work to add Storybook and Static-site SDK E2E tests. ## Test plan - [x] CLI lint passes - [x] CLI tests pass (1784/1784) - [x] Vitest client lint passes - [x] Vitest client unit tests pass (12/12) - [x] CI workflow YAML is valid - [ ] CI `test-sdk-e2e` job passes (will verify in this PR)

Robdel12 merged commit f92aec0 into main Jan 5, 2026
23 of 25 checks passed

Robdel12 deleted the rd/ember-bug-fixes branch January 5, 2026 08:17

KaelWD mentioned this pull request Jan 15, 2026

Vitest: Unknown comparison status: match #179

Closed

Robdel12 mentioned this pull request Jan 15, 2026

🧪 Add E2E tests for Storybook and Static-site SDKs #180

Open

Robdel12 mentioned this pull request Jan 15, 2026

🐛 Fix Vitest SDK status values and add SDK E2E tests to CI #181

Merged

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

✨ Add --fail-on-diff option and improve SDK resilience #169

✨ Add --fail-on-diff option and improve SDK resilience #169

Uh oh!

Robdel12 commented Jan 5, 2026

Uh oh!

claude bot commented Jan 5, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

✨ Add --fail-on-diff option and improve SDK resilience #169

✨ Add --fail-on-diff option and improve SDK resilience #169

Uh oh!

Conversation

Robdel12 commented Jan 5, 2026

Summary

Changes

Usage

Test plan

Uh oh!

claude bot commented Jan 5, 2026

Code Review: Add --fail-on-diff option and improve SDK resilience

✅ Strengths

1. Excellent API Design

2. Strong Test Coverage

3. Good Documentation

4. Backwards Compatibility

🔍 Issues & Suggestions

1. Missing Test: Per-Snapshot failOnDiff Override

2. Inconsistent Error Handling for Missing diffPercentage

3. Potential Race Condition in Server Discovery Cache

4. Missing Environment Variable Documentation

5. Function Name Could Be Clearer

🎯 Code Quality

🔒 Security & Performance

✨ Final Verdict

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants